Abstract

In this paper we study a dynamic resource allocation problem which we call the 
stochastic k-server problem. In this problem, requests for some service to be performed 
appear at various locations over time, and we have a collection of k mobile servers which 
are capable of servicing these requests. When servicing a request, we incur a cost equal 
to the distance traveled by the dispatched server. The goal is to find a strategy for 
choosing which server to dispatch to each incoming request which keeps the average 
service cost as small as possible. 

In the model considered in this paper, the locations of service requests are drawn 
according to an IID random process. We show that, given a statistical description 
of this process, we can compute a simple decentralized state-feedback policy which 
achieves an average cost within a factor of two of the cost achieved by an optimal state- 
feedback policy. In addition, we demonstrate similar results for several extensions of 
the basic stochastic k-server problem.

1 Introduction 

Recently, there has been great interest in the study of coordination strategies for teams 
of Unmanned Aerial Vehicles (UAVs). In particular, many researchers have focused on 
methods for designing efficient mission plans, under which a series of tasks can be carried 
out by a team of vehicles. A common high-level formulation of this type of problem 
consists of a series of waypoints that must be visited by the vehicles, with the goal of 
designing a strategy for visiting each of the waypoints in a manner which minimizes 
some measure of the overall travel time. When the set of locations to visit is known 
ahead of time, it is possible to plan the mission offline, and each vehicle can perform 
its own tasks without requiring communication among the vehicles [5115]. In a dynamic 
environment, where waypoints may appear while the system is in operation, the mission cannot
be planned entirely ahead of time. Such a formulation is considered in [5]. However, due to
limited computational and communication resources, it is generally not feasible to consider
coordination strategies which require complete communication among the vehicles during
system operation. The general problem considered in this paper is motivated by the
problem of multi-vehicle coordination in a dynamic environment.

The well known k-server problem is a natural model for dynamic task assignment problems
with distance-based costs. Roughly speaking, the k-server problem is as follows. We
are given a set of locations, and requests for services to be performed originate sequentially
from these locations. We have a collection of k mobile servers which are capable of
servicing these requests. At each point in time we must choose a server to serve the current
request, and we incur a cost equal to the distance traveled by the dispatched server.
The goal is to find a strategy for choosing which server to dispatch to each incoming
request which keeps the average service cost as small as possible.

The k-server problem has been well studied for the problem formulation where the
demand sequence may be arbitrary. Most of the literature on the k-server problem has
focused on the competitive analysis of online algorithms. An online algorithm is a strategy
which makes decisions based only on the knowledge of present and past requests, and
competitive analysis seeks to compare the performance of specific online algorithms with
the performance of an optimal strategy which knows the entire request sequence. The best
known results for the k-server problem show that a particular online algorithm (which
requires intensive computation to implement) achieves an overall cost which is essentially
within a factor of 2k - 1 of optimal. The reader is referred to [3] for a survey on the
k-server problem, online algorithms and competitive analysis.

In this paper we consider a variation of the k-server problem, where the locations of
the service requests are drawn at random according to an IID random process. With this
stronger assumption on the demand sequence, it is possible to show that a simple, practical
strategy can achieve performance comparable to an optimal state-feedback strategy.
Specifically, we show that, given a statistical description of the request sequence, we can
compute a simple decentralized state-feedback policy which achieves an overall cost within a
factor of two of the cost achievable by an optimal state-feedback policy. A decentralized
policy has the property that, once the policy is determined, no communication between
the servers is required for its implementation. In addition, we demonstrate similar results
for several extensions of the basic stochastic k-server problem.

2 Problem formulation 

In this section we give a precise formulation of the stochastic k-server problem. In our
formulation, the servers are positioned and the requests originate at points in some finite
set $\mathcal{S}$. The set $\mathcal{S}$ is equipped with a metric $d : \mathcal{S} \times \mathcal{S} \to \mathbb{R}_+$. At each time step
$t \in \mathbb{Z}_+$, service at some point $x(t) \in \mathcal{S}$ is requested and the k servers reside at the points
$x_1(t), \ldots, x_k(t) \in \mathcal{S}$. Exactly one server must be chosen to service the request at $x(t)$. If
server $u(t)$ is chosen, then a cost of $d(x_{u(t)}(t), x(t))$ is incurred and server $u(t)$ is relocated
to the point $x(t)$. That is, $x_{u(t)}(t+1) = x(t)$ and $x_i(t+1) = x_i(t)$ for all $1 \le i \le k$
such that $i \ne u(t)$. The next service request $x(t+1)$ is then randomly chosen. In our
model, each $x(t)$ is drawn according to the probability mass function $p : \mathcal{S} \to [0,1]$, and is
independent of $x(\tau)$ for all $\tau \ne t$. The goal of the problem is to determine a strategy for
assigning servers to service requests which keeps the average cost incurred in each time
step as small as possible. This is illustrated in Figure 1.
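The dynamics just described can be simulated directly. The following is a minimal sketch, not part of the analysis in this paper: the function names `simulate` and `make_nearest_server_policy`, the six-point line instance, and the greedy nearest-server rule are all assumptions introduced for illustration only.

```python
import random

def simulate(points, dist, pmf, servers, policy, steps, seed=0):
    """Run the stochastic k-server dynamics and return the average cost per step.

    points  : list of locations in the finite set S
    dist    : metric d(a, b)
    pmf     : request probabilities p(s), aligned with `points`
    servers : initial server locations x_1(0), ..., x_k(0)
    policy  : function (request, servers) -> index of the server to dispatch
    """
    rng = random.Random(seed)
    servers = list(servers)
    total = 0.0
    for _ in range(steps):
        request = rng.choices(points, weights=pmf, k=1)[0]  # IID draw x(t) ~ p
        u = policy(request, servers)
        total += dist(servers[u], request)  # cost d(x_u(t), x(t))
        servers[u] = request                # dispatched server relocates to x(t)
    return total / steps

def make_nearest_server_policy(dist):
    """Hypothetical centralized baseline: dispatch the currently closest server."""
    def policy(request, servers):
        return min(range(len(servers)), key=lambda i: dist(servers[i], request))
    return policy

# Illustrative instance: six points on a line, two servers.
points = [0, 1, 2, 3, 4, 5]
pmf = [0.3, 0.2, 0.0, 0.0, 0.2, 0.3]
line_metric = lambda a, b: abs(a - b)
avg = simulate(points, line_metric, pmf, [0, 5],
               make_nearest_server_policy(line_metric), steps=10_000)
```

Note that the greedy rule needs every server location at each step, so it is a centralized policy in the sense discussed later in the paper.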

The problem described in the previous paragraph can be formulated as a finite state 
Markov decision process with average cost criteria (see, for example, |SJ). In general, finite 
state Markov decision processes have a finite state space X, and a finite set of actions 
$\mathcal{U}$ available at each time step. Taking action $u \in \mathcal{U}$ when in state $y \in \mathcal{X}$ incurs a cost
$r(y, u)$. After taking action $u$ in state $y$, the system state in the next time period is $x \in \mathcal{X}$
with probability $\Pr(X(t+1) = x \mid X(t) = y, U(t) = u)$.

Figure 1: Gray squares represent server locations and the black square represents the
location of the service request. One server will move to the request location and incur a
cost equal to the distance traveled.

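A static state-feedback policy is a decision rule in which each $u(t)$ is chosen according
to a function $\mu : \mathcal{X} \to \mathcal{U}$ of the current state $\mathbf{x}(t)$. The steady-state average per-period
cost under the policy $\mu$ is

$$J(\mu, \mathbf{x}(0)) = \limsup_{T \to \infty} \frac{1}{T} \, E\left[ \sum_{t=0}^{T-1} r\big(X(t), \mu(X(t))\big) \,\middle|\, X(0) = \mathbf{x}(0) \right].$$

We denote a policy which minimizes this cost by $\mu^*$.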

The obvious formulation of the stochastic k-server problem as a Markov decision process
has the state at time $t$ given by $\mathbf{x}(t) = (x(t), x_1(t), \ldots, x_k(t))$, the current service request
location together with the set of current server locations. The state space is a subset
of $\mathcal{S}^{k+1}$ since we may exclude, without loss of generality, all states which have more than
one server assigned to a particular location. The action $u(t)$ taken at time $t$ is the index
of the chosen server, and the action space is $\mathcal{U} = \{1, \ldots, k\}$. The cost incurred at time $t$
is $r(\mathbf{x}(t), u(t)) = d(x_{u(t)}(t), x(t))$, the distance from the dispatched server to the current
service request. Under a static state-feedback policy, the state evolves according to a
Markov chain since $x$ is an IID random process and for each $t$, $x_1(t), \ldots, x_k(t)$ depends
only on the previous state.

Although algorithms exist for determining an optimal state-feedback policy for average
cost Markov decision processes, they are generally not practical for this problem.
One reason is that, under the formulation above, the system has $|\mathcal{S}|^{k+1}$ discrete states.
Numerical computation of an optimal policy will be intractable even for relatively small
values of $|\mathcal{S}|$ and $k$. Also, even if the optimal policy could be computed, this policy may
not lend itself to practical implementation. In particular, the optimal policy may be
structured so that the decision $u(t)$ must be made based on the knowledge of all server
locations at time $t$. This means that all servers would be required to communicate their
current locations to all other servers before each decision could be made. In the next
section, we will show that a fairly simple decentralized strategy can achieve an average
per-period cost within a factor of two of an optimal centralized strategy.
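To give a sense of scale for the state-space growth mentioned above, the short sketch below counts states for a hypothetical instance. The function name `state_counts` and the instance size (50 locations, 5 servers) are assumptions chosen only for illustration.

```python
from math import perm

def state_counts(num_locations, k):
    """Return (|S|^(k+1), number of states with servers at distinct locations)."""
    upper = num_locations ** (k + 1)
    # One request location times ordered placements of k servers on distinct points.
    distinct = num_locations * perm(num_locations, k)
    return upper, distinct

upper, distinct = state_counts(50, 5)
# upper is 50**6 = 15,625,000,000 states, before excluding co-located servers.
```

Even after excluding states with co-located servers, the count remains on the order of $10^{10}$ for this modest instance, which is already far too many states to tabulate a policy.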



3 Main result 

In this section we will consider decentralized policies for the k-server problem. After
introducing decentralized policies, we will show that there is a decentralized policy that 
can achieve performance close to that of an optimal policy. 

In a general state feedback policy, the decision of which server to dispatch to a request 
depends on the location of the request as well as the current location of all servers. 
In contrast, a decentralized policy is a policy in which each server makes a decision to 
serve the current request without knowledge of the locations of other servers. Given that 
one and only one server must respond to each request, it is necessary that decentralized 
policies have a special 'partition' structure. That is, decentralized policies partition the
set $\mathcal{S}$ into $k$ disjoint sets $\mathcal{S}_1, \ldots, \mathcal{S}_k$, and server $i$ serves location $x$ if and only if $x \in \mathcal{S}_i$.
This is illustrated in Figure 2. 



Figure 2: Locations are separated into disjoint partitions. Servers only serve locations in 
their partition. 

It turns out that there is always a decentralized policy for any instance of the stochastic
k-server problem that can achieve an average cost comparable to the optimal centralized
cost. This policy, which we will call $\mu_d$, is constructed as follows:

1. Compute the $m_1^*, \ldots, m_k^*$ which minimize

$$\sum_{s \in \mathcal{S}} p(s) \min_{1 \le i \le k} d(m_i, s).$$

2. Construct the disjoint partitions $\mathcal{S}_1, \ldots, \mathcal{S}_k$, where

$$\mathcal{S}_i = \{ s \in \mathcal{S} \mid d(m_i^*, s) \le d(m_j^*, s) \text{ for } j = 1, \ldots, k \}.$$

3. Let $\mu_d(\mathbf{x}) = i$ if $x \in \mathcal{S}_i$.
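The three construction steps can be sketched in code for very small instances. This is an illustrative sketch only: step 1 is done by exhaustive search over $\mathcal{S}^k$, which is not a practical method, and the names `kmedian_cost` and `build_decentralized_policy`, the tie-breaking rule (lowest server index), and the example instance are assumptions that do not appear in the paper.

```python
from itertools import product

def kmedian_cost(medians, points, dist, pmf):
    """Objective of step 1: sum over s of p(s) * min_i d(m_i, s)."""
    return sum(p * min(dist(m, s) for m in medians) for s, p in zip(points, pmf))

def build_decentralized_policy(points, dist, pmf, k):
    # Step 1: exhaustive search over all k-tuples of locations (tiny instances only).
    m_star = min(product(points, repeat=k),
                 key=lambda m: kmedian_cost(m, points, dist, pmf))
    # Steps 2 and 3: each location is served by the server whose median is nearest.
    # Ties are broken toward the lowest index so that the sets S_i are disjoint.
    assignment = {s: min(range(k), key=lambda i: dist(m_star[i], s)) for s in points}
    return m_star, assignment

points = [0, 1, 2, 3, 4, 5]
pmf = [0.3, 0.2, 0.0, 0.0, 0.2, 0.3]
m_star, assignment = build_decentralized_policy(points, lambda a, b: abs(a - b), pmf, k=2)
```

Once `assignment` is fixed, server $i$ only needs to know whether the request lies in its own set, so no server locations have to be exchanged at run time.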

Performance of this policy relative to an optimal policy is characterized in the following 
theorem, which is the main result of this paper. 

Theorem 1. The cost of the decentralized policy $\mu_d$ satisfies

$$J(\mu_d, \mathbf{x}(0)) \le 2 J(\mu^*, \mathbf{x}(0))$$

for all $\mathbf{x}(0) \in \mathcal{X}$.

In order to prove Theorem 1 we will employ a result which allows one to generate performance
bounds for general Markov decision processes. This result is proven in Pj for the
case of general measurable state spaces, and is presented here for the finite state space
case.

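Lemma 2. Consider a finite state Markov decision process with average cost criteria.
For any state feedback policy $\mu : \mathcal{X} \to \mathcal{U}$ and any function $h_U : \mathcal{X} \to \mathbb{R}$,

$$J(\mu, \mathbf{x}(0)) \le \sup_{\mathbf{x} \in \mathcal{X}} \{ r(\mathbf{x}, \mu(\mathbf{x})) + \Delta_U(\mathbf{x}) \}$$

for all $\mathbf{x}(0) \in \mathcal{X}$, where

$$\Delta_U(\mathbf{x}) = E[ h_U(X(t+1)) \mid X(t) = \mathbf{x}, U(t) = \mu(\mathbf{x}) ] - h_U(\mathbf{x}).$$

Moreover, for any function $h_L : \mathcal{X} \to \mathbb{R}$,

$$J(\mu^*, \mathbf{x}(0)) \ge \inf_{\mathbf{x} \in \mathcal{X},\, u \in \mathcal{U}} \{ r(\mathbf{x}, u) + \Delta_L(\mathbf{x}, u) \}$$

for all $\mathbf{x}(0) \in \mathcal{X}$, where

$$\Delta_L(\mathbf{x}, u) = E[ h_L(X(t+1)) \mid X(t) = \mathbf{x}, U(t) = u ] - h_L(\mathbf{x}).$$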

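Given the result in Lemma 2, we can now prove Theorem 1.

Proof of Theorem 1. First we will find a lower bound on $J(\mu^*, \mathbf{x}(0))$ using Lemma 2
with

$$h_L(\mathbf{x}) = \min_{1 \le i \le k} d(x_i, x).$$

For this choice of $h_L$,

$$\Delta_L(\mathbf{x}, u) = \sum_{s \in \mathcal{S}} p(s) \min_{1 \le i \le k} d(x_i(t+1), s) - \min_{1 \le i \le k} d(x_i, x),$$

where

$$x_i(t+1) = \begin{cases} x & \text{if } i = u \\ x_i & \text{otherwise.} \end{cases}$$

Since $d(x_u, x) \ge h_L(\mathbf{x})$ for all $\mathbf{x} \in \mathcal{X}$ and $u \in \mathcal{U}$, by Lemma 2 we have

$$J(\mu^*, \mathbf{x}(0)) \ge \min_{m \in \mathcal{S}^k} \left\{ \sum_{s \in \mathcal{S}} p(s) \min_{1 \le i \le k} d(m_i, s) \right\} \qquad (1)$$

for all $\mathbf{x}(0) \in \mathcal{X}$.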

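Let $m^*$ denote the minimizing $m$ in (1). Recall that the decentralized policy $\mu_d$ divides
the set $\mathcal{S}$ into disjoint partitions $\mathcal{S}_1, \ldots, \mathcal{S}_k$, where

$$\mathcal{S}_i = \{ s \in \mathcal{S} \mid d(m_i^*, s) \le d(m_j^*, s) \text{ for } j = 1, \ldots, k \}.$$

We will find an upper bound on $J(\mu_d, \mathbf{x}(0))$ using Lemma 2 with

$$h_U(\mathbf{x}) = 2 \min_{1 \le i \le k} d(m_i^*, x) + \sum_{i=1}^{k} d(x_i, m_i^*).$$

For this choice of $h_U$,

$$r(\mathbf{x}, \mu_d(\mathbf{x})) + \Delta_U(\mathbf{x}) = 2 \sum_{s \in \mathcal{S}} p(s) \min_{1 \le i \le k} d(m_i^*, s) + d(x_{\mu_d(\mathbf{x})}, x) - d(x, m^*_{\mu_d(\mathbf{x})}) - d(x_{\mu_d(\mathbf{x})}, m^*_{\mu_d(\mathbf{x})}).$$

Since $d$ is a metric,

$$d(x_{\mu_d(\mathbf{x})}, x) \le d(x_{\mu_d(\mathbf{x})}, m^*_{\mu_d(\mathbf{x})}) + d(m^*_{\mu_d(\mathbf{x})}, x),$$

and therefore

$$J(\mu_d, \mathbf{x}(0)) \le 2 \sum_{s \in \mathcal{S}} p(s) \min_{1 \le i \le k} d(m_i^*, s) \le 2 J(\mu^*, \mathbf{x}(0)).$$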
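The two inequalities used in the proof can be checked numerically on a small instance by enumerating every state. The sketch below is only a sanity check under assumed data (six points on a line, two servers, an assumed request distribution); the helper names `median_cost`, `mu_d`, `h_upper`, `h_lower` and `drift` are introduced here for illustration.

```python
from itertools import product

points = [0, 1, 2, 3, 4, 5]
pmf = [0.3, 0.2, 0.0, 0.0, 0.2, 0.3]
k = 2
dist = lambda a, b: abs(a - b)

def median_cost(m):
    """Objective appearing on the right-hand side of (1)."""
    return sum(p * min(dist(mi, s) for mi in m) for s, p in zip(points, pmf))

m_star = min(product(points, repeat=k), key=median_cost)
c_star = median_cost(m_star)

def mu_d(x):
    """Decentralized policy: index of the median nearest to the request x."""
    return min(range(k), key=lambda i: dist(m_star[i], x))

def h_upper(x, servers):
    return (2 * min(dist(m, x) for m in m_star)
            + sum(dist(xi, m) for xi, m in zip(servers, m_star)))

def h_lower(x, servers):
    return min(dist(xi, x) for xi in servers)

def drift(h, x, servers, u):
    """E[h(next state) | state, action u] - h(state)."""
    nxt = list(servers)
    nxt[u] = x  # the dispatched server relocates to the request
    return sum(p * h(s, nxt) for s, p in zip(points, pmf)) - h(x, servers)

# sup over states of r + Delta_U under mu_d (upper bound of Lemma 2)
worst_upper = max(dist(servers[mu_d(x)], x) + drift(h_upper, x, servers, mu_d(x))
                  for x in points for servers in product(points, repeat=k))
# inf over states and actions of r + Delta_L (lower bound of Lemma 2)
best_lower = min(dist(servers[u], x) + drift(h_lower, x, servers, u)
                 for x in points for servers in product(points, repeat=k)
                 for u in range(k))
```

On this instance `best_lower` is at least `c_star` and `worst_upper` is at most `2 * c_star`, which is exactly the sandwich that gives the factor of two.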



4 Computing decentralized policies 

It was shown in the last section that finding a decentralized policy which achieves an average
cost within a factor of two of optimal reduces to finding the $m_1^*, \ldots, m_k^*$ minimizing

$$\sum_{s \in \mathcal{S}} p(s) \min_{1 \le i \le k} d(m_i, s).$$

In other words, a decentralized policy for our dynamic problem can be determined by
solving a static combinatorial optimization problem. This static problem has been well
studied, and is known as the k-median problem.

The number of possible solutions to the k-median problem is $\binom{|\mathcal{S}|}{k}$. Unfortunately,
there are no known algorithms for finding an optimal solution with computational requirements
that scale well with $k$. However, much study has been devoted to efficient
approximation algorithms for this problem. In this section we will show that the result of
the previous section can be combined with known results on approximation algorithms for
the k-median problem to obtain efficient algorithms for computing decentralized policies
for the stochastic k-server problem.

Suppose $\hat{m}_1, \ldots, \hat{m}_k$ is a suboptimal solution to the k-median problem. Let $\hat{\mu}_d$ be the
decentralized policy constructed with the disjoint partitions $\hat{\mathcal{S}}_1, \ldots, \hat{\mathcal{S}}_k$, where

$$\hat{\mathcal{S}}_i = \{ s \in \mathcal{S} \mid d(\hat{m}_i, s) \le d(\hat{m}_j, s) \text{ for } j = 1, \ldots, k \}.$$

The following lemma relates the performance of the policy $\hat{\mu}_d$ to the quality of the
suboptimal k-median solution $\hat{m}_1, \ldots, \hat{m}_k$.

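Lemma 3. Suppose the suboptimal k-median solution $\hat{m}_1, \ldots, \hat{m}_k$ satisfies

$$\sum_{s \in \mathcal{S}} p(s) \min_{1 \le i \le k} d(\hat{m}_i, s) \le \rho \sum_{s \in \mathcal{S}} p(s) \min_{1 \le i \le k} d(m_i^*, s)$$

for some $\rho \ge 1$. Then

$$J(\hat{\mu}_d, \mathbf{x}(0)) \le 2 \rho \, J(\mu^*, \mathbf{x}(0))$$

for all $\mathbf{x}(0) \in \mathcal{X}$.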

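Proof. We can find an upper bound on $J(\hat{\mu}_d, \mathbf{x}(0))$ using Lemma 2 with

$$h_U(\mathbf{x}) = 2 \min_{1 \le i \le k} d(\hat{m}_i, x) + \sum_{i=1}^{k} d(x_i, \hat{m}_i).$$

Proceeding exactly as in the proof of Theorem 1 we obtain

$$J(\hat{\mu}_d, \mathbf{x}(0)) \le 2 \sum_{s \in \mathcal{S}} p(s) \min_{1 \le i \le k} d(\hat{m}_i, s) \le 2 \rho \sum_{s \in \mathcal{S}} p(s) \min_{1 \le i \le k} d(m_i^*, s) \le 2 \rho \, J(\mu^*, \mathbf{x}(0)).$$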



In other words, an approximation algorithm which produces factor $\rho$ suboptimal solutions
to the k-median problem leads to a method for computing factor $2\rho$ suboptimal
decentralized policies for the stochastic k-server problem. One particularly attractive
approximation algorithm for the k-median problem is the local search heuristic of [Q. This
algorithm is particularly simple to implement and capable of achieving an approximation
ratio of $3 + \epsilon$ for any $\epsilon > 0$, where there is a tradeoff between computational requirements
and approximation ratio.
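To illustrate the flavor of local search for the k-median problem, the following sketch repeatedly swaps one median for one non-median location whenever the swap lowers the cost. This is a plain single-swap variant written for illustration; it is not claimed to be the exact procedure of the cited reference, it omits the improvement threshold and multi-swap moves that the $3 + \epsilon$ guarantee relies on, and the names `kmedian_cost` and `local_search_kmedian` and the example instance are assumptions.

```python
def kmedian_cost(medians, points, dist, pmf):
    """Weighted k-median objective: sum over s of p(s) * min_i d(m_i, s)."""
    return sum(p * min(dist(m, s) for m in medians) for s, p in zip(points, pmf))

def local_search_kmedian(points, dist, pmf, k):
    medians = list(points[:k])  # arbitrary initial solution
    cost = kmedian_cost(medians, points, dist, pmf)
    improved = True
    while improved:
        improved = False
        for i in range(k):
            for candidate in points:
                if candidate in medians:
                    continue
                # Swap median i for the candidate and keep the swap if it helps.
                trial = medians[:i] + [candidate] + medians[i + 1:]
                trial_cost = kmedian_cost(trial, points, dist, pmf)
                if trial_cost < cost - 1e-12:
                    medians, cost, improved = trial, trial_cost, True
    return medians, cost

points = [0, 1, 2, 3, 4, 5]
pmf = [0.3, 0.2, 0.0, 0.0, 0.2, 0.3]
line_metric = lambda a, b: abs(a - b)
initial_cost = kmedian_cost(points[:2], points, line_metric, pmf)
medians, cost = local_search_kmedian(points, line_metric, pmf, k=2)
```

By Lemma 3, any medians returned by such a heuristic can be turned into a decentralized policy whose guarantee degrades only by the heuristic's own approximation factor.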

5 Extensions 

In this section we will discuss several extensions of the basic stochastic k-server problem
and show that results analogous to Theorem 1 can be established.

5.1 Server-dependent processing costs 

The first extension we consider generalizes the k-server model to the case where the servers
are not equal in their processing capabilities. In particular, we model the cost of serving
a job at location $x$ by server $u$ at location $x_u$ as

$$r(\mathbf{x}, u) = d_u(x_u, x) + c_u(x).$$

The amount of resources consumed (time, fuel, etc.) by moving from location $x_u$ to
location $x$ depends on the server, and is modeled by the metric $d_u$ if server $u$ is chosen.
Once the server arrives at the service location, an additional cost of $c_u(x) \ge 0$ is incurred
when processing the job at location $x$ by server $u$.

As before, decentralized policies partition the state space and assign exactly one server 
to each partition. We have the following theorem regarding decentralized policies for the 
case of server-dependent processing costs. 

Theorem 4. For the problem with server-dependent processing costs, there exists a
decentralized policy $\mu_d$ such that

$$J(\mu_d, \mathbf{x}(0)) \le 2 J(\mu^*, \mathbf{x}(0))$$

for all $\mathbf{x}(0) \in \mathcal{X}$.

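Proof. Similar to the proof of Theorem 1, we will find a lower bound on $J(\mu^*, \mathbf{x}(0))$ using
Lemma 2 with

$$h_L(\mathbf{x}) = \min_{1 \le i \le k} \{ d_i(x_i, x) + c_i(x) \}.$$

For this choice of $h_L$, we obtain the lower bound

$$J(\mu^*, \mathbf{x}(0)) \ge \min_{m \in \mathcal{S}^k} \left\{ \sum_{s \in \mathcal{S}} p(s) \min_{1 \le i \le k} \{ d_i(m_i, s) + c_i(s) \} \right\} \qquad (2)$$

for all $\mathbf{x}(0) \in \mathcal{X}$. Note that, unlike the proof of Theorem 1, the order in which $m_1, \ldots, m_k$
are indexed affects the lower bound in (2).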

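Let $m^*$ denote the minimizing $m$ in (2). The decentralized policy $\mu_d$ divides the set $\mathcal{S}$
into disjoint partitions $\mathcal{S}_1, \ldots, \mathcal{S}_k$ where

$$\mathcal{S}_i = \{ s \in \mathcal{S} \mid d_i(m_i^*, s) + c_i(s) \le d_j(m_j^*, s) + c_j(s) \ \ \forall j \}.$$

We will find an upper bound on $J(\mu_d, \mathbf{x}(0))$ using Lemma 2 with

$$h_U(\mathbf{x}) = 2 \min_{1 \le i \le k} \{ d_i(m_i^*, x) + c_i(x) \} + \sum_{i=1}^{k} d_i(x_i, m_i^*).$$

Denoting $u_d = \mu_d(\mathbf{x})$, we have

$$r(\mathbf{x}, u_d) + \Delta_U(\mathbf{x}) = 2 \sum_{s \in \mathcal{S}} p(s) \min_{1 \le i \le k} \{ d_i(m_i^*, s) + c_i(s) \} - c_{u_d}(x) + d_{u_d}(x_{u_d}, x) - d_{u_d}(x, m^*_{u_d}) - d_{u_d}(x_{u_d}, m^*_{u_d}).$$

Since $d_{u_d}$ is a metric,

$$d_{u_d}(x_{u_d}, x) \le d_{u_d}(x_{u_d}, m^*_{u_d}) + d_{u_d}(m^*_{u_d}, x).$$

Since $c_{u_d}(x) \ge 0$, we have

$$J(\mu_d, \mathbf{x}(0)) \le 2 \sum_{s \in \mathcal{S}} p(s) \min_{1 \le i \le k} \{ d_i(m_i^*, s) + c_i(s) \} \le 2 J(\mu^*, \mathbf{x}(0)).$$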
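The policy used in this proof can be sketched for a tiny instance in the same brute-force style as the basic case. Everything named here is an assumption made for illustration: the functions `hetero_cost` and `build_hetero_policy`, the two example servers (one twice as slow, with a fixed processing cost of 0.5), and the tie-breaking toward the lowest index. Because the servers differ, the search runs over ordered tuples $m \in \mathcal{S}^k$.

```python
from itertools import product

def hetero_cost(m, points, dists, procs, pmf):
    """Objective in (2): sum over s of p(s) * min_i { d_i(m_i, s) + c_i(s) }."""
    k = len(m)
    return sum(p * min(dists[i](m[i], s) + procs[i](s) for i in range(k))
               for s, p in zip(points, pmf))

def build_hetero_policy(points, dists, procs, pmf):
    k = len(dists)
    # Exhaustive search over ordered k-tuples; index i is tied to server i.
    m_star = min(product(points, repeat=k),
                 key=lambda m: hetero_cost(m, points, dists, procs, pmf))
    # Location s is served by the server with the smallest travel-plus-processing
    # cost measured from its own median (ties go to the lowest index).
    assignment = {s: min(range(k), key=lambda i: dists[i](m_star[i], s) + procs[i](s))
                  for s in points}
    return m_star, assignment

points = [0, 1, 2, 3, 4, 5]
pmf = [0.3, 0.2, 0.0, 0.0, 0.2, 0.3]
dists = [lambda a, b: abs(a - b), lambda a, b: 2 * abs(a - b)]  # server 1 is slower
procs = [lambda s: 0.0, lambda s: 0.5]                           # server 1 costs more
m_star, assignment = build_hetero_policy(points, dists, procs, pmf)
hetero = hetero_cost(m_star, points, dists, procs, pmf)
```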
5.2 Multiple requests per period

Next we consider the case when some fixed number $n \le k$ of requests is generated and
must be served in each time step. Specifically, at time step $t$, service is requested at some
set of points $\bar{x}_1(t), \ldots, \bar{x}_n(t) \in \mathcal{S}$, and exactly $n$ servers must be chosen to service these
requests. Here the state at time $t$ is given by $\mathbf{x}(t) = (x_1(t), \ldots, x_k(t), \bar{x}_1(t), \ldots, \bar{x}_n(t))$.
Let $u_j(t)$ denote the index of the server chosen to service request $j$. For this case the
action at time $t$ is $u(t) = (u_1(t), \ldots, u_n(t))$ and the action space is

$$\mathcal{U} = \{ u \in \{1, \ldots, k\}^n \mid u_i \ne u_j \text{ for } i \ne j \}.$$

At time $t$, a cost of $\sum_{j=1}^{n} d(x_{u_j(t)}(t), \bar{x}_j(t))$ is incurred. Server $u_j(t)$ is then relocated to
the point $\bar{x}_j(t)$, and the next set of requests is drawn according to some probability mass
function $p : \mathcal{S}^n \to [0, 1]$.

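Decentralized policies for this case are a natural extension of the partition policies for
the single request case. We will analyze the performance of the decentralized policy $\mu_d$
which is constructed as follows.

1. Find the $m_1^*, \ldots, m_k^*$ minimizing

$$\sum_{s \in \mathcal{S}^n} p(s) \min_{u \in \mathcal{U}} \left\{ \sum_{j=1}^{n} d(m_{u_j}, s_j) \right\}.$$

2. Let

$$\mu_d(\mathbf{x}) = \arg\min_{u \in \mathcal{U}} \left\{ \sum_{j=1}^{n} d(m^*_{u_j}, \bar{x}_j) \right\}.$$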

In this policy, the server at point $x_i$ is always associated with the median at point $m_i^*$.
When a new batch of requests arrives, each request is matched to one of the medians.
No two requests are matched to the same median. If the request at point $\bar{x}_j$ is matched
to the median at point $m_i^*$, then this request is served by the server at point $x_i$. Note
that, unlike the single request case, servers may move between partitions associated with 
several medians. This is because multiple requests may appear in the same partition, and 
must be served by multiple servers. 
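The matching step of this policy can be sketched as follows, assuming the medians have already been computed. The function name `match_requests`, the 0-based indices, and the example medians and requests are assumptions for illustration; the brute-force enumeration of injective assignments is only sensible for small $k$ and $n$.

```python
from itertools import permutations

def match_requests(medians, requests, dist):
    """Return u = (u_1, ..., u_n), distinct server indices minimizing
    sum_j d(m_{u_j}, request_j). Indices are 0-based."""
    k, n = len(medians), len(requests)
    return min(permutations(range(k), n),
               key=lambda u: sum(dist(medians[u[j]], requests[j]) for j in range(n)))

medians = [0, 5, 10]
line_metric = lambda a, b: abs(a - b)

# Requests near different medians: each goes to its nearest median's server.
spread = match_requests(medians, [9, 1], line_metric)     # (2, 0)
# Both requests near the median at 10: one must be taken by another server.
crowded = match_requests(medians, [9, 10], line_metric)   # (1, 2)
```

The second call shows the situation described above: two requests fall in the same partition, so the server associated with the median at 5 is pulled into the region around 10.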

Analysis of this case is much like that of the single request case, and is presented in 
the following theorem. 

Theorem 5. The cost of the decentralized policy $\mu_d$ satisfies

$$J(\mu_d, \mathbf{x}(0)) \le 2 J(\mu^*, \mathbf{x}(0))$$

for all $\mathbf{x}(0) \in \mathcal{X}$.



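Proof. The lower bound on $J(\mu^*, \mathbf{x}(0))$ is determined using Lemma 2 with

$$h_L(\mathbf{x}) = \min_{u \in \mathcal{U}} \left\{ \sum_{j=1}^{n} d(x_{u_j}, \bar{x}_j) \right\}.$$

For this choice of $h_L$, we obtain

$$J(\mu^*, \mathbf{x}(0)) \ge \min_{m \in \mathcal{S}^k} \left\{ \sum_{s \in \mathcal{S}^n} p(s) \min_{u \in \mathcal{U}} \left\{ \sum_{j=1}^{n} d(m_{u_j}, s_j) \right\} \right\}$$

for all $\mathbf{x}(0) \in \mathcal{X}$.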

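The upper bound on $J(\mu_d, \mathbf{x}(0))$ is determined using Lemma 2 with

$$h_U(\mathbf{x}) = 2 \min_{u \in \mathcal{U}} \left\{ \sum_{j=1}^{n} d(m^*_{u_j}, \bar{x}_j) \right\} + \sum_{i=1}^{k} d(x_i, m_i^*).$$

Let

$$x_i(t+1) = \begin{cases} \bar{x}_j & \text{if } i = \mu_d(\mathbf{x})_j \\ x_i & \text{otherwise.} \end{cases}$$

For this choice of $h_U$,

$$\begin{aligned}
r(\mathbf{x}, \mu_d(\mathbf{x})) + \Delta_U(\mathbf{x})
&= 2 \sum_{s \in \mathcal{S}^n} p(s) \min_{u \in \mathcal{U}} \left\{ \sum_{j=1}^{n} d(m^*_{u_j}, s_j) \right\} + \sum_{j=1}^{n} d(x_{\mu_d(\mathbf{x})_j}, \bar{x}_j) \\
&\quad + \sum_{i=1}^{k} d(x_i(t+1), m_i^*) - 2 \sum_{j=1}^{n} d(m^*_{\mu_d(\mathbf{x})_j}, \bar{x}_j) - \sum_{i=1}^{k} d(x_i, m_i^*) \\
&= 2 \sum_{s \in \mathcal{S}^n} p(s) \min_{u \in \mathcal{U}} \left\{ \sum_{j=1}^{n} d(m^*_{u_j}, s_j) \right\} \\
&\quad + \sum_{j=1}^{n} \left( d(x_{\mu_d(\mathbf{x})_j}, \bar{x}_j) - d(m^*_{\mu_d(\mathbf{x})_j}, \bar{x}_j) - d(x_{\mu_d(\mathbf{x})_j}, m^*_{\mu_d(\mathbf{x})_j}) \right).
\end{aligned}$$

Since $d$ is a metric,

$$d(x_{\mu_d(\mathbf{x})_j}, \bar{x}_j) \le d(x_{\mu_d(\mathbf{x})_j}, m^*_{\mu_d(\mathbf{x})_j}) + d(m^*_{\mu_d(\mathbf{x})_j}, \bar{x}_j)$$

for all $j$. Therefore,

$$J(\mu_d, \mathbf{x}(0)) \le \sup_{\mathbf{x} \in \mathcal{X}} \{ r(\mathbf{x}, \mu_d(\mathbf{x})) + \Delta_U(\mathbf{x}) \} \le 2 \sum_{s \in \mathcal{S}^n} p(s) \min_{u \in \mathcal{U}} \left\{ \sum_{j=1}^{n} d(m^*_{u_j}, s_j) \right\} \le 2 J(\mu^*, \mathbf{x}(0)).$$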



It is worth noting that for the two extensions presented in this section, computing
decentralized policies requires solving generalizations of the k-median problem. Whether
any of the existing approximation algorithms for the k-median problem can be extended
to these generalizations is not clear, and is a topic for further research.

6 Conclusion 

In this paper we presented the stochastic k-server problem, and showed that a simple 
decentralized state-feedback policy achieves an average cost within a factor of two of the 
cost achieved by an optimal state-feedback policy. These results were then extended to 
several variations of the basic stochastic k-server problem.

In this paper, we presented a formulation where the set of possible locations to be 
served is finite. We have focused on this formulation because low complexity algorithms 
for computing decentralized policies exist in this case. In fact, it is straightforward to use 
the results of P] to show that the results of this paper hold in infinite bounded metric 
spaces as well. 